Online Reinforcement Learning Using a Probability Density Estimation

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Online Reinforcement Learning Using a Probability Density Estimation

Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to domin...

متن کامل

Reinforcement Learning for Robot Control using Probability Density Estimations

The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and in essence, too large for conventional RL algorithms to work. The well known curse of dimensionality makes infeasible using a tabular representation of the value function, which is the classica...

متن کامل

Parametric Return Density Estimation for Reinforcement Learning

Most conventional Reinforcement Learning (RL) algorithms aim to optimize decisionmaking rules in terms of the expected returns. However, especially for risk management purposes, other risk-sensitive criteria such as the value-at-risk or the expected shortfall are sometimes preferred in real applications. Here, we describe a parametric method for estimating density of the returns, which allows u...

متن کامل

Online Probability Density Estimation of Nonstationary Random Signal using Dynamic Bayesian Networks 109 Online Probability Density Estimation of Nonstationary Random Signal using Dynamic Bayesian Networks

We present two estimators for discrete non-Gaussian and nonstationary probability density estimation based on a dynamic Bayesian network (DBN). The first estimator is for offline computation and consists of a DBN whose transition distribution is represented in terms of kernel functions. The estimator parameters are the weights and shifts of the kernel functions. The parameters are determined th...

متن کامل

Probability Density Estimation of the Q Function for Reinforcement Learning IRI Technical Report

Performing Q-Learning in continuous state-action spaces is a problem still unsolved for many complex applications. The Q function may be rather complex and can not be expected to fit into a predefined parametric model. In addition, the function approximation must be able to cope with the high non-stationarity of the estimated q values, the on-line nature of the learning with a strongly biased s...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Neural Computation

سال: 2017

ISSN: 0899-7667,1530-888X

DOI: 10.1162/neco_a_00906